Text Readability Assessment for Second Language Learners
Abstract
This paper addresses the task of readability assessment for texts aimed at second language (L2) learners. One of the major challenges in this task is the lack of sufficiently large level-annotated datasets. For the present work, we collected a dataset of CEFR-graded texts tailored for learners of English as an L2 and investigated text readability assessment for both native readers and L2 learners. We applied a generalization method to adapt models trained on larger native corpora to estimate text readability for learners, and explored domain adaptation and self-learning techniques to make use of the native data to improve system performance on the limited L2 data. In our experiments, the best-performing model for readability on learner texts achieves an accuracy of 0.797 and a Pearson correlation coefficient (PCC) of 0.938.
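As a rough illustration of the two metrics reported in the abstract, the sketch below computes accuracy and PCC over predicted CEFR levels. The integer mapping `CEFR_TO_INT`, the function names, and the toy labels are assumptions made for this example; they are not details taken from the paper.

```python
from math import sqrt

# Hypothetical ordinal encoding of CEFR levels (an assumption for illustration).
CEFR_TO_INT = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}


def accuracy(gold, pred):
    """Fraction of texts whose predicted level exactly matches the gold level."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("PCC is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Toy labels, invented for this example only.
gold_levels = ["A2", "B1", "B2", "C1", "C2"]
pred_levels = ["A2", "B1", "C1", "C1", "C2"]

acc = accuracy(gold_levels, pred_levels)
pcc = pearson(
    [CEFR_TO_INT[level] for level in gold_levels],
    [CEFR_TO_INT[level] for level in pred_levels],
)
print(f"accuracy={acc:.3f} PCC={pcc:.3f}")
```

Accuracy counts only exact level matches, whereas PCC also rewards predictions that land near the gold level, which is one reason the two figures can differ substantially on an ordinal scale such as CEFR.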
Related resources
Qualitative and Quantitative Examination of Text Type Readabilities: A Comparative Analysis
This study compared 2 main approaches to readability assessment. The quantitative approach applied idea density based on part of speech tagging and compared 3 sets of text types (i.e., narrative, expository, and argumentative) with respect to their ease of reading. The qualitative approach was done through developing questionnaires measuring intermediate EFL learners’ perceptions on content, motivat...
Web Readability and Computer-Assisted Language Learning
Proficiency in a second language is of vital importance for many people. Today’s access to corpora of text, including the Web, allows new techniques for improving language skill. Our project’s aim is the development of techniques for presenting the user with suitable web text, to allow optimal language acquisition via reading. Some text found on the Web may be of a suitable level of difficulty ...
On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition
We investigate the problem of readability assessment using a range of lexical and syntactic features and study their impact on predicting the grade level of texts. As empirical basis, we combined two web-based text sources, Weekly Reader and BBC Bitesize, targeting different age groups, to cover a broad range of school grades. On the conceptual side, we explore the use of lexical and syntactic ...
Developing EFL Learners' Oral Proficiency through Animation-based Instruction of English Formulaic Sequences
The current pretest-posttest quasi-experimental study attempts, firstly, to probe the effects of teaching formulaic sequences (FSs) on the oral proficiency improvement of second or foreign language (L2) learners and, secondly, to examine whether teaching FSs through different resources (i.e. animation vs. text-based readings) has any differentially influential effects in augmenting L2 l...
A Machine Learning Approach to Measurement of Text Readability for EFL Learners Using Various Linguistic Features
The present paper introduces and evaluates a readability measurement method designed for learners of EFL (English as a foreign language). The proposed readability measurement method (a regression model) estimates the text readability based on linguistic features, such as lexical, syntactic and discourse features. Text readability refers to the comprehension rate of a text (0.0-1.0). The experim...